Near-Optimal BRL using Optimistic Local Transitions

Authors

  • Mauricio Araya-López
  • Olivier Buffet
  • Vincent Thomas
Abstract

Model-based Bayesian Reinforcement Learning (BRL) allows a sound formalization of the problem of acting optimally while facing an unknown environment, i.e., avoiding the exploration-exploitation dilemma. However, algorithms explicitly addressing BRL suffer from such a combinatorial explosion that a large body of work relies on heuristic algorithms. This paper introduces BOLT, a simple and (almost) deterministic heuristic algorithm for BRL which is optimistic about the transition function. We analyze BOLT's sample complexity and show that, under certain parameters, the algorithm is near-optimal in the Bayesian sense with high probability. Experimental results then highlight the key differences between this method and previous work.

Similar resources

Optimistic Planning for the Near-Optimal Control of General Nonlinear Systems with Continuous Transition Distributions ⋆

Optimistic planning is an optimal control approach from artificial intelligence, which can be applied in receding horizon. It works for very general nonlinear dynamics and cost functions, and its analysis establishes a tight relationship between computation invested and near-optimality. However, there is no optimistic planning algorithm that searches for closed-loop solutions in stochastic prob...

BRL Quasi-Optimal à l’aide de Transitions Locales Optimistes

Abstract: Model-based Bayesian reinforcement learning (BRL) allows a sound formalization of the problem of acting optimally while facing an unknown environment, i.e., avoiding the exploration-exploitation dilemma. However, algorithms explicitly addressing BRL suffer from such a combinatorial explosion that a large body of work relies on algorithms...

Learning Parsimonious Classification Rules from Gene Expression Data Using Bayesian Networks with Local Structure

The comprehensibility of good predictive models learned from high-dimensional gene expression data is attractive because it can lead to biomarker discovery. Several good classifiers provide comparable predictive performance but differ in their abilities to summarize the observed data. We extend a Bayesian Rule Learning (BRL-GSS) algorithm, previously shown to be a significantly better predictor...

Chemical Equilibrium Mixture Computations for Energetic Material Combustion in Closed Vessels

A major computational code called CERV was developed to determine complex equilibrium compositions of a nonideal mixture of numerous imperfect gases and compressible liquid and solid species with phase transitions for closed-vessel applications. This code minimizes Gibbs energy using reaction variables, in contrast to other major codes like BRL-Blake, BRCBagheera and NASA-CEA that use compositi...

Tighter Value Function Bounds for Bayesian Reinforcement Learning

Bayesian reinforcement learning (BRL) provides a principled framework for an optimal exploration-exploitation tradeoff in reinforcement learning. We focus on model-based BRL, which involves a compact formulation of the optimal tradeoff from the Bayesian perspective. However, computing the Bayes-optimal policy remains a computational challenge. In this paper, we propose a novel approach to...

Journal title:
  • CoRR

Volume: abs/1206.4613

Publication date: 2012